Abstract

Not all virtual reality applications today require the
power or expense of a single large visualization
“supercomputer”. Factors such as frame rate and polygon
count have a major impact upon the performance of a
VR application. Increasingly, low-cost commodity
consumer electronics and computing technology are
becoming powerful enough to deliver an acceptable level of
graphics performance. Commodity PCs already drive
virtual reality workbenches with stereo and tracking
options. The next step is to have them drive
multi-screen environments.

We present an experiment motivated by the low cost
per performance of PC commodity clusters. The
experiment replaces a visualization supercomputing
platform driving a 4-wall immersive display system [1] with
a PC commodity cluster. We describe the system, its
implementation, and experimental testing in this paper.

1.  Introduction 

The cost per performance of PC commodity clusters is
rapidly making them a viable alternative to traditional
high-end visualization supercomputers. Why should anyone
use a commodity cluster when the capability already
exists in solutions from SGI [9] and Sun [11]? Consumer
electronics and computing technology have evolved at an
astounding rate. This rapid evolution has both driven
down costs and shortened obsolescence cycles.
A general rule for buying graphics capability
from SGI is to budget $250,000 per graphics pipe. Those
who cannot afford to outfit an Onyx-class computer
with four graphics pipes may add extra raster managers
to two pipes to drive a 4-wall display system. In
contrast, our experimental cluster cost less than $1000
per node. With the addition of a video matrix switcher,
the grand total was less than $15,000. Even with fast
obsolescence cycles, the price difference is so great that an
organization could afford to replace or upgrade the
graphics cluster many times. Another advantage of PCs
is the wide availability of low-cost parts, which can be
used for repairs and upgrades. Overall, PCs are
cost-effective, powerful, and flexible.
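
The cost gap can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above. Treating the $250,000-per-pipe rule of thumb as the exact price of a four-pipe system, and taking the $15,000 upper bound as the cluster total, are simplifying assumptions.

```python
# Back-of-the-envelope comparison using the figures quoted in the text.
# Assumptions: a four-pipe system costs exactly 4 x the per-pipe rule of thumb,
# and the cluster costs its stated upper bound.
SGI_COST_PER_PIPE = 250_000   # USD, rule-of-thumb budget per graphics pipe
PIPES_FOR_FOUR_WALLS = 4      # one pipe per wall
CLUSTER_TOTAL = 15_000        # USD, upper bound including the matrix switcher

sgi_total = SGI_COST_PER_PIPE * PIPES_FOR_FOUR_WALLS
# Number of complete cluster rebuilds the same budget would cover.
replacements = sgi_total // CLUSTER_TOTAL

print(f"Four-pipe system: ${sgi_total:,}")
print(f"Cluster rebuilds affordable for that price: {replacements}")
```

Under these assumptions the budget for one four-pipe system would pay for the whole cluster dozens of times over, which is the sense in which the cluster can be replaced or upgraded "many times".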

We present an experiment that integrates a commodity
cluster into an existing 4-wall display system, a
Surround-Screen Visualization System (SSVR) [2] from
Mechdyne Corporation. The objective is to attain active
stereo visualization on multiple walls using genlocking,
swap-locking and data-locking capabilities.

High-end visualization supercomputers offer multi-wall,
active stereo visualization packaged together. Stereo
presentation and coordination of scene graph data are
handled automatically by the computer in hardware
or by invoking proprietary software libraries.
The cluster was designed from the beginning to
replace the aging SGI computing equipment used to drive
our current 4-wall display system. We find ourselves
taxing the capabilities of an Onyx 2 system with
Infinite Reality 2 graphics by demanding that increasing
numbers of polygons be rendered while needing a
fixed frame rate for active stereo. When implementing a
cluster, stereo presentation and scene coordination must
be handled explicitly in order to produce a coherent scene
across many screens. We describe the system,
implementation and experimental testing in the paper.

2.  System  Overview 

We  present  the  design  strategies,  system  requirements 
and  details  of  our  system  here. 

2.1  Cluster  Design  Strategies 

Communication between the cluster nodes is vital. Data
such as pixels, geometric primitives, or even scene graph
data are passed among the nodes. How the data is handled
and what type of data is passed greatly affect the network
bandwidth requirements of the cluster. Two basic
approaches to setting up the communication software
architecture of a graphics cluster are Client/Server and
Master/Slave [3]. Each method, along with its benefits and
disadvantages, is described in more detail below.

Report Documentation Page (Standard Form 298, Rev. 8-98)

Report date: 2002
Dates covered: 00-00-2002 to 00-00-2002
Title and subtitle: Integration of a Commodity Cluster Into an Existing 4-Wall Display System
Performing organization: Virtual Reality Laboratory, Naval Research Laboratory, Washington, DC, 20375
Distribution/availability statement: Approved for public release; distribution unlimited
Supplementary notes: Workshop on Commodity-Based Visualization Clusters, IEEE Visualization 2002, October 27, Boston, Massachusetts, 2002
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Same as Report (SAR)
Number of pages: 4

Proceedings of Workshop on Commodity-Based Visualization Clusters,
IEEE Visualization 2002, October 27, Boston, Massachusetts.

[Figure: block diagram. Labels: Ethernet switch (100BaseT), cluster nodes, 6DOF tracking, parallel port link, terminals, SGI RGB input, video matrix switcher, genlock, CRT projectors, infrared emitters.]

Fig. 1: Commodity Cluster Configuration Diagram

Client/Server: In the Client/Server approach, a single node
of the cluster serves data to the graphics rendering clients.
The advantage of this arrangement is that many applications
may embed a server that works with the same rendering
client nodes, which makes the environment very flexible.
The disadvantage is a higher consumption of network
bandwidth. Most Client/Server clusters rely upon relatively
expensive Myrinet [4] or gigabit networking hardware.

Master/Slave: In the Master/Slave approach, each node of
the graphics cluster locally stores and runs an identical
copy of the graphics application. Consequently, only a
small amount of information needs to be shared among the
nodes, and network bandwidth becomes less of a concern.
This information may simply consist of input device data
and timestamps. In this configuration, the master node
handles application state changes.
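
To illustrate how little data the Master/Slave approach needs to share, the sketch below packs one frame's worth of state into a fixed-size message. The field layout (frame counter, timestamp, 6DOF head pose, button mask) and the function names are assumptions made for illustration, not the message format used in this system.

```python
import struct

# Hypothetical per-frame state message: frame counter, timestamp (seconds),
# head position (x, y, z), head orientation (heading, pitch, roll), button mask.
STATE_FORMAT = "!Id6fI"  # "!" selects network byte order with standard sizes
STATE_SIZE = struct.calcsize(STATE_FORMAT)

def pack_state(frame, timestamp, pose, buttons):
    """Master side: serialize the shared state for one frame."""
    return struct.pack(STATE_FORMAT, frame, timestamp, *pose, buttons)

def unpack_state(payload):
    """Slave side: recover the state so the local application copy can apply it."""
    frame, timestamp, *pose, buttons = struct.unpack(STATE_FORMAT, payload)
    return frame, timestamp, tuple(pose), buttons

message = pack_state(42, 12.5, (0.0, 1.5, -2.0, 90.0, 0.0, 0.0), 0b0101)
print(STATE_SIZE, "bytes per frame")
print(unpack_state(message))
```

A message of this size is negligible next to the capacity of a 100BaseT link, which is why the Master/Slave arrangement does not depend on expensive networking hardware.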


2.2 Graphics Cluster Requirements

All graphics clusters must satisfy three requirements:
genlocking, swap locking, and data locking [3]. These
are described in more detail below.

Genlocking: Genlocking is the process of synchronizing
the video frames from each node in a cluster so that they
produce a fluid, coherent image. Genlocking may be
achieved through software or hardware.

Swap Locking: Swap locking is the process of synchronizing
the frame buffer rendering and swapping. This is
necessary since each view of a scene contains different
amounts of data and numbers of polygons to render.
These may produce different rendering times for each
frame for each node.

Data Locking: Data locking is the process of synchronizing
the views to maintain consistency across the
screens. This becomes an issue since each node in the
cluster renders its frames from locally stored information.
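
Swap locking amounts to a barrier: no node swaps its buffers until every node has finished rendering the current frame. The sketch below simulates this with threads standing in for cluster nodes. `threading.Barrier` replaces whatever signaling a real cluster would use, and the render times are made up.

```python
import threading
import time

NODE_RENDER_TIMES = [0.01, 0.03, 0.02]  # made-up per-node render times (seconds)
FRAMES = 3

swap_barrier = threading.Barrier(len(NODE_RENDER_TIMES))
swap_log = []                 # (frame, node) recorded at each simulated buffer swap
log_lock = threading.Lock()

def node(node_id, render_time):
    for frame in range(FRAMES):
        time.sleep(render_time)   # stand-in for rendering this node's view
        swap_barrier.wait()       # block until every node has finished the frame
        with log_lock:
            swap_log.append((frame, node_id))  # stand-in for the buffer swap

threads = [threading.Thread(target=node, args=(i, t))
           for i, t in enumerate(NODE_RENDER_TIMES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

frames_in_swap_order = [frame for frame, _ in swap_log]
print(frames_in_swap_order)
```

Because a node cannot reach the barrier for the next frame before it has swapped the current one, no node ever swaps frame n+1 while another is still showing frame n-1, regardless of how uneven the render times are.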


2.3  Hardware  Overview 

We used a set of standard PC configurations equipped
with MSI G4Ti4600 graphics adapters powered by
NVidia Corp.'s GeForce4 Ti GPU and 128 MB of DDR
video memory [5]. Although not strictly necessary,
the PCs were identical, which made software installation
easier. The PCs communicated via 100BaseT networking
adapters and a 100BaseT switch. We show a diagram
of the complete system in Figure 1.

The projectors of the SSVR are connected to an Extron
CrossPoint Plus 124 matrix video switcher [6]. The
switcher accepts video input from 12 sources and routes
it to 4 outputs.

Since genlocking and data locking are handled in software
through the parallel ports, a special box (Fig. 2) was
fabricated to handle the signaling. This box was built
from commercial off-the-shelf (COTS) hardware for less
than $20. It also outputs a genlocking signal to a set of
Crystal Eyes [7] infrared emitters.

3.  System  Implementation 

In  this  section  we  give  an  overview  of  the  software  and 
installation  used  to  create  the  cluster  environment. 

3.1  Software 

A variety of software is used to meet the active stereo
and scene synchronization needs of the cluster. The
commodity cluster was built upon a standard installation
of Red Hat Linux 7.2 [14], with the kernel patched
to include the Real-time Application Interface
(RTAI) [8]. RTAI provides low latency and task
completion times that can be determined with certainty.

SoftGenLock [9] and RTAI are used in concert to provide
a software active stereo solution. The RTAI kernel
module detects the vertical refresh of the monitor and
changes a pointer in video card memory that tells the
video card what to draw on the screen. A double-sized
buffer is provided to the application by specifying a
virtual frame buffer in XFree86 [10] that is twice the size
of the actual frame buffer being drawn to the screen. For
example, to display at a resolution of 1024 x 768 in
active stereo, the virtual desktop would need a
resolution of 2048 x 768. The RTAI kernel module splits
the frame buffer in half and alternately displays the
two 1024 x 768 halves. An application draws in
stereo by rendering the left-eye and right-eye views to
the left and right sides of the X frame buffer, respectively.
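
The frame buffer arithmetic described here can be sketched as follows. The function names, and the choice of which eye is shown on even-numbered refreshes, are assumptions for illustration. SoftGenLock's actual register-level mechanism is not reproduced.

```python
def virtual_desktop_size(width, height):
    """Virtual frame buffer: twice the displayed width, both eye images side by side."""
    return 2 * width, height

def eye_viewport(eye, width, height):
    """Region (x, y, w, h) of the virtual frame buffer the application renders for one eye."""
    x_offset = 0 if eye == "left" else width
    return x_offset, 0, width, height

def displayed_eye(refresh_count):
    """Eye shown on a given vertical refresh; starting with the left eye is an assumption."""
    return "left" if refresh_count % 2 == 0 else "right"

width, height = 1024, 768
print(virtual_desktop_size(width, height))  # (2048, 768)
for refresh in range(4):
    eye = displayed_eye(refresh)
    print(refresh, eye, eye_viewport(eye, width, height))
```

The x offset returned by `eye_viewport` plays the role of the display pointer that the kernel module changes on each vertical refresh: moving it by one display width selects the other eye's image.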

Genlock/data lock is achieved by synchronizing the
machines in the cluster over the parallel ports. The
RTAI kernel module writes to one pin on the parallel
port and reads from another to make sure that the other
machines in the cluster have completed a frame. The
master tells the other nodes when to draw, and the other
nodes report back when they are ready for a new frame.
Data lock is achieved by making sure that the slave
machines are on the correct eye when the parallel port
is in a certain state.
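
The draw/ready handshake can be illustrated with a small user-space simulation. This is a sketch under stated assumptions: the `SimulatedSlave` class, the per-slave latencies, and the idea that the master sees a single combined "all ready" signal are hypothetical stand-ins for the parallel-port pins driven by the RTAI kernel module.

```python
class SimulatedSlave:
    """Slave node with a frame counter and a 'ready' line (stand-in for a port pin)."""

    def __init__(self, name, latency):
        self.name = name
        self.latency = latency  # refresh ticks needed to finish a frame (made up)
        self.remaining = 0
        self.frame = 0

    @property
    def ready(self):
        return self.remaining == 0

    def on_draw(self):
        """Master raised the draw signal: start the next frame."""
        self.frame += 1
        self.remaining = self.latency

    def tick(self):
        """One vertical refresh passes."""
        if self.remaining > 0:
            self.remaining -= 1

def run_master(slaves, ticks):
    """Master issues a draw signal only when every slave reports ready."""
    master_frame = 0
    for _ in range(ticks):
        if all(s.ready for s in slaves):
            master_frame += 1
            for s in slaves:
                s.on_draw()
        for s in slaves:
            s.tick()
    return master_frame

slaves = [SimulatedSlave("left wall", 1),
          SimulatedSlave("front wall", 1),
          SimulatedSlave("right wall", 2)]
frames = run_master(slaves, ticks=6)
print(frames, [s.frame for s in slaves])
```

The slowest node sets the pace, and every node always shows the same frame number as the master. If the eye is derived from the frame number, all walls therefore show the same eye at the same time.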

SoftGenLock does not synchronize applications between
the nodes of a cluster; it provides data lock and
stereo. It is the responsibility of the application to
synchronize its viewing frustum and animations.
Lastly, since SoftGenLock only uses VGA
registers, it can potentially work with any graphics card.

3.2  Installation 

When installing a PC cluster, a number of issues must be
dealt with before setup. Here we list the important points
and give some recommendations on how to set up the
system properly:

•  Be  sure  to  have  adequate  HVAC,  power  and 
network  access  where  the  cluster  will  be  set  up. 

•  Set  up  the  cluster  and  display  system  in  separate 
rooms,  since  noise  levels  from  fans,  drives,  etc. 
may  be  distracting. 

•  Prepare to deal with a number of issues associated
with laying the cables. Time taken at this stage to
label the cabling and check it for proper size and
length will save time later, and will alleviate
problems with signal degradation and with simply
losing a cable in the “nest”.

•  Create  a  master  power  switch  to  turn  off  all 
units  at  once. 

We constructed and installed a 3-wall cluster in less than
two weeks. This cluster was configured to use active
stereo and has the ability to demonstrate swap locking and
data locking.

4.  Testing  and  Evaluation

We tested the PC commodity cluster by implementing a
simple visualization software system. We utilized an
interoperable software architecture, VR Juggler [11], which
provides a set of programming abstractions for interfacing
with a variety of display, tracking and computing
systems, and a variety of interaction devices. The
software, which is used for visualizing terrain information,
has a variety of display modes including stereo. It can be
recompiled to work on different computing architectures,
and reconfigured at execution time by including different
default device configuration files.

The application integrated smoothly with the cluster
system and performed better than expected. The quality
of the displays and stereo viewing was comparable to that
of the same software running on three walls using an SGI
Onyx 2 with IR2 graphics. In addition, the cluster was
able to visualize the data with better performance. Figure 3
shows the cluster software running in the 4-wall display
system on the three side walls.

5.  Summary  and  Conclusions 

The graphics PC cluster is becoming a viable low-cost
alternative to single large visualization supercomputers.
The PCs and related video hardware are fairly cheap,
computationally powerful, and flexible. There is also an
abundance of software available for them, such as VR
Juggler, which allows applications to interface easily with
a variety of display systems and interaction devices. Our
cluster experiment explored the feasibility of phasing out
and replacing existing, expensive, single large computing
hardware. The results show we can make this kind of
transition in the near future, and we believe our
experiences will motivate others to follow suit.